SemanticScuttle - klotz.me » klotz: retrieval augmented generation

klotz: retrieval augmented generation*

kotaemon: Open-source RAG UI for chatting with your documents

An open-source project offering a functional RAG UI for document QA, suitable for both end-users and developers. It supports various LLM providers, is customizable, and offers multi-modal QA, citations, and complex reasoning methods.

2024-10-13 Tags: kotaemon, rag, ui, document, qa, github, python, gradio by klotz

Search for a Plug-and-Play RAG Solution for Large Language Models

A discussion in r/LocalLLaMA about finding a self-hosted, local RAG (Retrieval Augmented Generation) solution for large language models that lets users experiment with different prompts, models, and retrieval rankings. Suggested tools and resources include Open-WebUI, kotaemon, and tldw.

2024-10-13 Tags: localllama, llm, rag, self-hosted, reddit by klotz

Using query intent to boost retrieval results

This article discusses the importance of determining user query intent to enhance search results. It covers how to identify search and answer intents, implement intent detection using language models, and adjust retrieval strategies accordingly.

2024-10-13 Tags: query intent, llm, search, rag by klotz
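
The idea in the query-intent entry above (detect whether a query wants a list of results or a direct answer, then adjust retrieval) can be sketched roughly as follows. This is only an illustration, not the article's implementation: `keyword_classifier` is a hypothetical stand-in for the language-model call, and the settings returned by `retrieval_settings` (the `top_k` values and the `generate_answer` flag) are assumptions chosen for the example.

```python
QUESTION_WORDS = {"who", "what", "when", "where", "why", "how"}


def keyword_classifier(query):
    """Hypothetical stand-in for an LLM call that labels a query.

    A real system would prompt a language model; this heuristic only
    keeps the example runnable without network access.
    """
    words = query.strip().lower().split()
    first = words[0] if words else ""
    if query.strip().endswith("?") or first in QUESTION_WORDS:
        return "answer"
    return "search"


def detect_intent(query, classify):
    """Normalize the classifier's label, falling back to 'search'."""
    label = classify(query).strip().lower()
    return label if label in {"search", "answer"} else "search"


def retrieval_settings(intent):
    """Map an intent to retrieval parameters (values are assumptions)."""
    if intent == "answer":
        # Few, high-precision passages to hand to a generator.
        return {"top_k": 3, "generate_answer": True}
    # Broader result list for browsing.
    return {"top_k": 20, "generate_answer": False}


if __name__ == "__main__":
    for q in ["How does HyDE work?", "redis vector database"]:
        intent = detect_intent(q, keyword_classifier)
        print(q, "->", intent, retrieval_settings(intent))
```

Passing the classifier in as a function keeps the routing logic independent of whichever model ends up doing the labeling.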

New Technique Makes RAG Systems Much Better at Retrieving the Right Documents

Researchers from Cornell University developed a technique called 'contextual document embeddings' to improve the performance of Retrieval-Augmented Generation (RAG) systems, enhancing the retrieval of relevant documents by making embedding models more context-aware.

Standard methods like bi-encoders often fail to account for context-specific details, leading to poor performance on application-specific datasets. Contextual document embeddings address this by making the embedding model more sensitive to subtle differences between documents, particularly in specialized domains.

The researchers proposed two complementary methods to improve bi-encoders:

Modifying the training process using contrastive learning to distinguish between similar documents.
Modifying the bi-encoder architecture to incorporate corpus context during the embedding process.

These modifications allow the model to capture both the general context and specific details of documents, leading to better performance, especially in out-of-domain scenarios. The new technique has shown consistent improvements over standard bi-encoders and can be adapted for various applications beyond text-based models.

2024-10-10 Tags: rag, embedding, document retrieval, llm by klotz

Scaling RAG from POC to Production

The article discusses the challenges and components required to scale Retrieval Augmented Generation (RAG) from a Proof of Concept (POC) to production. It covers key issues such as performance, data management, risk, integration into workflows, and cost. It also outlines architectural components such as scalable vector databases, caching mechanisms, advanced search techniques, responsible AI layers, and API gateways needed for overcoming these challenges.

2024-10-08 Tags: rag, performance, production engineering by klotz

Using Redis for Real-Time RAG Goes Beyond a Vector Database

This article discusses the importance of real-time access for Retrieval Augmented Generation (RAG) and how Redis can enable this through its real-time vector database, semantic cache, and LLM memory capabilities, leading to faster and more accurate responses in GenAI applications.

2024-10-07 Tags: redis, rag, real-time, genai, vector database, semantic cache, llm, memory by klotz
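
A semantic cache, as mentioned in the Redis entry above, returns a stored LLM response when a new prompt is close enough in embedding space to one seen before. The sketch below shows that lookup logic in plain Python with an in-memory list. It does not use Redis or any Redis API; `toy_embed`, the `SemanticCache` class, and the `0.8` threshold are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def toy_embed(text):
    """Stand-in embedding: bag-of-words counts (a real cache would use an embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


class SemanticCache:
    """In-memory sketch: a linear scan over (vector, response) pairs."""

    def __init__(self, embed=toy_embed, threshold=0.8):
        self.embed = embed
        self.threshold = threshold
        self.entries = []

    def put(self, prompt, response):
        self.entries.append((self.embed(prompt), response))

    def get(self, prompt):
        """Return the most similar cached response, or None on a miss."""
        vec = self.embed(prompt)
        best_score, best_response = 0.0, None
        for stored, response in self.entries:
            score = cosine(vec, stored)
            if score > best_score:
                best_score, best_response = score, response
        return best_response if best_score >= self.threshold else None


if __name__ == "__main__":
    cache = SemanticCache()
    cache.put("what is the capital of france", "Paris")
    print(cache.get("What is the capital of France?"))
    print(cache.get("how do I bake bread"))
```

A production cache would replace the linear scan with a vector index, which is the role the entry attributes to Redis.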

How to Use HyDE for Better LLM RAG Retrieval

Dr. Leon Eversberg explains how to improve the retrieval step in RAG pipelines using the HyDE technique, making LLMs more effective in accessing external knowledge through documents.

2024-10-05 Tags: hyde, llm, rag, document retrieval by klotz
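
HyDE (Hypothetical Document Embeddings) has the LLM write a hypothetical answer to the query first, then retrieves real documents by similarity to that answer rather than to the raw query. A minimal sketch of that flow, not taken from the article: `fake_llm` is a hypothetical stub returning canned text, and `toy_embed` is a bag-of-words stand-in for a real embedding model.

```python
import math
import re
from collections import Counter


def toy_embed(text):
    """Stand-in embedding: bag-of-words counts (assumption, not a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def hyde_retrieve(query, docs, generate, embed=toy_embed, top_k=2):
    """Embed a generated hypothetical answer and rank docs against it."""
    hypothetical = generate(query)   # step 1: the LLM drafts an answer
    q_vec = embed(hypothetical)      # step 2: embed the draft, not the query
    ranked = sorted(docs, key=lambda d: cosine(q_vec, embed(d)), reverse=True)
    return ranked[:top_k]            # step 3: return the nearest real documents


def fake_llm(query):
    """Hypothetical stub; a real pipeline would call an LLM here."""
    return ("Photosynthesis converts sunlight, water and carbon dioxide "
            "into glucose and oxygen in plant chloroplasts.")


if __name__ == "__main__":
    docs = [
        "Chloroplasts in plant cells use sunlight to turn carbon dioxide and water into glucose.",
        "The stock market closed higher on Friday.",
        "Redis stores data in memory.",
    ]
    print(hyde_retrieve("How do plants make food?", docs, fake_llm, top_k=1))
```

In this toy setup the raw query shares no terms with the relevant document, while the hypothetical answer does, which is the gap HyDE is meant to close.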

Discovering Semantic Search and RAG with Large Language Models (LLMs)

Covers foundational concepts, a practical implementation of semantic search, and the RAG workflow, highlighting its advantages and range of applications.

The article provides a step-by-step guide to implementing a basic semantic search using TF-IDF and cosine similarity. This includes preprocessing steps, converting text to embeddings, and searching for relevant documents based on query similarity.

2024-10-04 Tags: llm, semantic search, rag, nlp, embeddings, asymmetric by klotz
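
The TF-IDF and cosine-similarity search described in the entry above can be sketched with the standard library alone. This is a minimal illustration, not the article's code: the function names, the tokenizer, and the smoothed IDF formula are assumptions chosen for the example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Preprocess: lowercase and split on non-alphanumeric characters."""
    return re.findall(r"[a-z0-9]+", text.lower())


def vectorize(tokens, idf):
    """Build a sparse TF-IDF vector; terms missing from the corpus are dropped."""
    counts = Counter(tokens)
    total = len(tokens) or 1
    return {t: (c / total) * idf[t] for t, c in counts.items() if t in idf}


def build_tfidf(docs):
    """Return (idf, vectors) for a list of document strings."""
    tokenized = [tokenize(d) for d in docs]
    n = len(docs)
    df = Counter(term for tokens in tokenized for term in set(tokens))
    # Smoothed IDF so that every corpus term keeps a positive weight.
    idf = {t: math.log((1 + n) / (1 + c)) + 1 for t, c in df.items()}
    return idf, [vectorize(tokens, idf) for tokens in tokenized]


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def search(query, docs, top_k=3):
    """Rank documents by cosine similarity to the query; returns (doc, score) pairs."""
    idf, vectors = build_tfidf(docs)
    q = vectorize(tokenize(query), idf)
    scored = [(cosine(q, v), i) for i, v in enumerate(vectors)]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [(docs[i], score) for score, i in scored[:top_k]]


if __name__ == "__main__":
    corpus = [
        "Redis is a fast in-memory data store",
        "Rerankers reorder search results by relevance",
        "TF-IDF weights terms by rarity across documents",
    ]
    for doc, score in search("how do rerankers improve relevance", corpus):
        print(round(score, 3), doc)
```

Because TF-IDF matches only on shared terms, a query phrased with different vocabulary scores zero against a relevant document, which is the limitation learned embeddings are meant to address.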

Rerank API Documentation

This page provides documentation for the rerank API, including endpoints, request parameters, and response formats.

2024-09-28 Tags: api, jina, reranker, llm, rag, jina ai, search, relevance, l3 by klotz

Jina AI Reranker

Maximize search relevancy and RAG accuracy with Jina Reranker. Features include multilingual retrieval, code search, and a 6x speedup over the previous version.

2024-09-28 Tags: reranker, jina ai, search, relevance, rag, l3 by klotz
